COMPUTER NETWORKING: APPROACHES TO QUALITY SERVICE ASSURANCE



Rona B. Stillman

The problem of quality service assurance in a (generalized)
computer networking environment is addressed. In the absence of
any direct, well-defined, quantitative measure of service quality
and reliability, error collection and analysis is the only basis
for service quality control. Therefore, mechanisms are described
which facilitate reporting of operational errors, documentation of
error corrections, and collection of system performance data. Since
techniques for hardware quality control are well known, these
mechanisms focus on collecting data which can be used to assess and
control software quality. Finally, specific network facilities are
described which support research in the area of software quality,
and potential areas of new research using the network are identified.

Key words: Compiler, computer network, documentation,
dynamic software analysis, interpreter, quality control,
software testing, software verification, static software
analysis, structured programming, system errors, system
performance, theorem-proving.



1. INTRODUCTION 

Although the goal of reliable, fail-soft network service at reasonable
cost suggests certain concepts in network design (e.g., acentric rather
than star network, process rather than processor orientation, etc.),
this report will, as far as is possible, be independent of the details
of any particular network philosophy or topology. We do assume,
however, that all resource providers concur in the belief that user
satisfaction is the primary goal of the network, and will, therefore:

(1) require that programmers document the results of their work.

(2) assign responsibility for assuring the quality of user service 
to a designated group of experts. In particular, a responsible 
individual must be identified for each software module. 

(3) exhibit complete honesty in acknowledging failures and tracking
down their causes (although information obtained from users and
independently collected by the network will provide a check on
this).

(4) abide by network rules and conventions, which are designed to
minimize the effects of system failures on users and to encourage
stable, reliable service.



A primary factor in the success of any network is user satisfaction.
Two important causes of user dissatisfaction are: isolation, i.e.,
lack of access to consultation and application expertise, no
established channel through which to report system errors to the
responsible engineers, and unavailability of data on network
performance in general, and individual subsystem performance in
particular; and poor quality service, e.g., chaotic service, frequent
system crashes, unstable data and programs, inaccurate and/or
inadequate documentation, bug-laden and perfunctorily maintained
system software. At the same time, resource providers find it
difficult to correct and maintain their systems without sufficient
feedback information from users.

To alleviate these problems in networking, we suggest establishing
"network central," a technical-administrative office of the network.
(Note that, in fact, network central need not be a single organization
in the network operations management structure. For reasons of
convenience, and because we are concerned with the functions rather
than the implementation of network central, we refer to it as a single
entity. Further information on network management structure and
implementation can be found in [12].) Network central will serve as an
information center to users, providing guidance and consultation on
specific problems. It will serve as the point of contact between the
user and the network. In particular, network central will channel
complaints from users to the appropriate host nodes, and will convey
reports of system modifications from host nodes to users. It will also
be responsible for maintaining records on system performance and
service quality, and will establish and enforce network quality
control procedures. Within this context, we will describe generalized
mechanisms for:



(1) constructing performance profiles for individual subsystems
in the network, and for the network as a whole;

(2) defining and implementing quality control procedures for
network systems;

(3) using the unique environment provided by the network to
experiment with and evaluate new techniques for increasing
software reliability. Specific network facilities to support
software research will be described, and potential
areas of research using the network will be identified.



2. SYSTEM PERFORMANCE MEASUREMENT



In order to assist users and network managers in analyzing the
performance of a distributed network or of individual network
subsystems over time, and to permit the performance of different
subsystems to be compared, network central must maintain a file of
system performance profiles. The file would consist of periodic (e.g.,
weekly, monthly, quarterly) performance profiles for each subsystem,
provided by the host node, as well as performance profiles for the
network as a whole provided by network central (see Figure 1).

Figure 1: Maintenance of File of System Performance Profiles

A System Performance Profile should include:

SYSTEM IDENTIFICATION
REPORTING PERIOD
TIME SCHEDULED
TOTAL DOWN TIME
PERCENT OF TIME DOWN
INTERRUPTIONS TO SERVICE (NUMBER)
MEAN TIME BETWEEN FAILURES (MTBF)
MEAN TIME TO REPAIR (MTTR)
ARRAY DESCRIBING THE INTERRUPTIONS TO SERVICE:

    TYPE              NUMBER    TIME LOST    MTBF    MTTR
    HARDWARE
    SOFTWARE
    COMMUNICATIONS
    ENVIRONMENT
    UNCLASSIFIED
    UNKNOWN

MISCELLANEOUS PROBLEMS



An interruption to service is any type of system failure which aborts
execution of programs being run by a majority of the system's current
users, or which causes a suspension of system operation for more than
X (e.g., five) minutes regardless of whether any jobs are aborted.
For example, loss of electrical power at a host node is clearly an
interruption to that system's service. Alternatively, any catastrophic
error which necessitates a complete system dump followed by a restart
constitutes an interruption to service. Moreover, if a disk pack is
being moved from a faulty disk unit onto a replacement, and if all
system activity is suspended for more than X minutes while the exchange
takes place, an interruption to service is recorded. In any event, all
problems, whether they cause interruptions to service or not, are
recorded (as miscellaneous problems) in the system's performance
records.

Interruptions to service are further identified, whenever possible,
as to cause, e.g., hardware, software, communications. Environmental
failures include power failures, air conditioning equipment failures,
etc. Human failures are usually procedural errors made during
operation, e.g., the operator pushing the wrong button at a critical
moment. Unclassified failures are those for which the immediate symptom
is known (e.g., every other word in memory dropped bit 15), but the
underlying cause (hardware or software) is not. Unknown failures are
those which, despite considerable analysis, defy explanation. The
system performance profile is essentially identical to summaries which
have been used successfully on the Dartmouth Time Sharing System [11]
and on Bell's No. 1 Electronic Switching System [1].
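
The summary figures of a profile can be derived mechanically from the
list of recorded interruptions. The following is a minimal sketch, not
a prescribed format: the names Interruption and build_profile are
hypothetical, and the formulas for MTBF (up time divided by number of
interruptions) and MTTR (time lost divided by number of interruptions)
are assumed, since the recording rules themselves are left to the
network.

```python
from dataclasses import dataclass


@dataclass
class Interruption:
    cause: str         # e.g. "HARDWARE", "SOFTWARE", "COMMUNICATIONS"
    hours_lost: float  # service time lost to this interruption


def build_profile(hours_scheduled, interruptions):
    """Summarize one reporting period, overall and per cause."""

    def summarize(items):
        n = len(items)
        lost = sum(i.hours_lost for i in items)
        up = hours_scheduled - lost
        return {
            "number": n,
            "time_lost": lost,
            # Assumed definitions; the text does not fix the formulas.
            "mtbf": up / n if n else None,
            "mttr": lost / n if n else None,
        }

    overall = summarize(interruptions)
    overall["percent_down"] = 100.0 * overall["time_lost"] / hours_scheduled
    by_cause = {}
    for cause in sorted({i.cause for i in interruptions}):
        by_cause[cause] = summarize(
            [i for i in interruptions if i.cause == cause])
    return overall, by_cause
```

Because every host would compute its figures by the same rule, profiles
from different periods or subsystems remain comparable.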

Clearly, the host node is responsible for the accuracy, completeness,
and timeliness of his system's profiles. As in accounting, it
becomes very difficult to compare results between one time period and
the next, or between one subsystem and another, if the recording rules
vary. Therefore, it is essential that the recording rules be strictly
and uniformly obeyed throughout the network.

The resource supplier, however, should not be the sole source of
information on his system's performance. Independent sources can
provide complementary and, to some extent, redundant information, and
thereby serve to "keep the profiles honest." One such source is the
network itself. By polling the activity of the subsystems at regular
intervals, the network can derive its own gross profile of subsystem
performance (e.g., up time vs. down time). A more important source
of information is the formal mechanism provided by the network (and
described in the next section) through which users can report the
details of any operational difficulties they encounter.
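
As an illustration of the polling idea only, the sketch below derives a
gross up-time figure from a sequence of poll results. The function name
gross_uptime and the representation of a poll as a single true/false
answer are assumptions made for the example; an actual network would
choose its own polling interval and record format.

```python
def gross_uptime(polls):
    """polls: list of booleans, one per regular polling interval
    (True if the subsystem answered the poll).

    Returns (fraction of polls answered, number of observed outages).
    """
    if not polls:
        raise ValueError("no polls recorded")
    fraction_up = sum(polls) / len(polls)
    # An up -> down transition marks the start of an observed outage;
    # the subsystem is assumed to have been up before the first poll.
    previous = [True] + polls[:-1]
    outages = sum(1 for prev, cur in zip(previous, polls)
                  if prev and not cur)
    return fraction_up, outages
```

A figure obtained this way is coarse (an outage shorter than the polling
interval can be missed), which is why it complements rather than
replaces the host's own profile.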



3. QUALITY CONTROL PROCEDURES



3.1 Trouble Reporting Procedures: The Operational Errors File and the
Corrections File

The ultimate source of information about the quality of network
service is the user. Therefore, it is important to establish a
mechanism which assures that system problems discovered by users are
properly documented and routed to the responsible engineers. We suggest
that this mechanism consist of a set of trouble reporting procedures
and two files maintained by network central called the Operational
Errors File and the Corrections File. When users experience operating
difficulties with the network, they describe the problem (over the
phone) to a group at network central that is responsible for user
support. If the problem cannot be identified as procedural, or cannot
otherwise be determined to be user caused, and if the problem is not a
duplicate of one that has been reported previously, an Operational
Error Report is generated by network central and entered into the
Operational Errors File. Every Operational Error Report is assigned a
unique identification number, and includes:

ERROR IDENTIFICATION NUMBER
DATE OF ERROR REPORT
COMPLAINANT IDENTIFICATION
HOST NODE IDENTIFICATION
PARTICULAR SERVICES INVOLVED
PROBLEM DESCRIPTION:
    DATE, TIME OF DIFFICULTY
    WHAT WAS DONE
    RESULTS EXPECTED
    RESULTS OBTAINED
    COPY OF PROGRAM, DATA (when appropriate)
LIST OF SUBSEQUENT COMPLAINANTS, DATES

Network central then alerts the appropriate specialists at the host
node (a list of services provided and engineers responsible for them is
maintained at network central) to the relevant Operational Error
Report. When a solution has been found, the host specialists generate a
Corrections Report, which includes:

CORRECTION IDENTIFICATION NUMBER
DATE OF CORRECTION REPORT
HOST NODE IDENTIFICATION
OPERATIONAL ERROR(S) ADDRESSED (i.e., list of Error ID's, if any)
SOURCE OF ERROR (Hardware, Software, etc.)
ERROR DESCRIPTION (e.g., module in which Software error
    occurred, statements involved, etc.)
DATE OF CORRECTION
CORRECTION DESCRIPTION
DESCRIPTION OF REGRESSION TESTS PERFORMED (type and extent)
DOCUMENTATION CHANGES (if appropriate).




The correction description is a detailed account of what was done,
where, and why it was done, e.g., in the case of a software correction,
which statements in which modules were added, deleted, or altered, and
an explanation of why this method of correction was deemed appropriate.
All efforts to verify that the correction has not introduced new errors
are detailed in the description of regression tests performed, e.g.,
satisfactory execution of a standard set of tests, use of software
testing tools to construct new tests to exercise the software, etc.
Network central enters the Correction Report into the Corrections File,
and maintains tables cross referencing the Operational Errors File and
the Corrections File. Data on user-caused problems (e.g.,
misinterpreting the documentation, not having documentation) may be
recorded separately by network central, for use in later analysis on
the effectiveness of training sessions, the quality and availability of
documentation, etc.
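
The two files and the cross-reference table can be pictured with a
small sketch. The class and field names below (ErrorReport,
CorrectionReport, NetworkCentral, fixed_by) are hypothetical and carry
only a subset of the report fields listed above; the sketch is meant to
show how the cross reference follows from the "Operational Error(s)
Addressed" entry, not to prescribe a file layout.

```python
from dataclasses import dataclass, field


@dataclass
class ErrorReport:
    error_id: int
    host_node: str
    services: list
    description: str
    complainants: list = field(default_factory=list)


@dataclass
class CorrectionReport:
    correction_id: int
    host_node: str
    errors_addressed: list  # Error IDs; may be empty
    source_of_error: str    # "Hardware", "Software", etc.
    description: str
    regression_tests: str


class NetworkCentral:
    def __init__(self):
        self.errors = {}       # Operational Errors File
        self.corrections = {}  # Corrections File
        self.fixed_by = {}     # cross reference: error_id -> correction ids

    def file_error(self, report):
        self.errors[report.error_id] = report
        self.fixed_by.setdefault(report.error_id, [])

    def file_correction(self, report):
        self.corrections[report.correction_id] = report
        for error_id in report.errors_addressed:
            self.fixed_by.setdefault(error_id, []).append(
                report.correction_id)

    def open_errors(self):
        """Error IDs for which no correction has yet been filed."""
        return sorted(e for e, fixes in self.fixed_by.items() if not fixes)
```

The list of open errors is one simple indication of how well a
subsystem is being maintained, as distinct from how well it behaves.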

This system of reporting complaints and repairs is particularly
appropriate in a national network environment, where the user would
otherwise have no direct contact with system designers and engineers.
Moreover, the files reveal not only how well the network and its
component subsystems are behaving, but also how well they are being
maintained (see Figure 2).





Figure 2: Maintenance of Operational Errors and Corrections Files

(Legend: read/write access; read-only access; communications path,
e.g., phone, written records, on-line files, etc.)

3.2 System Change Procedures

In order to try out modifications to a software module without
undue risk and to test corrections to existing problems, both official
and experimental versions of the module should be offered on the
network. Users of the experimental versions would be warned that they
do so at their own risk, and would, therefore, be expected to protect
their data and programs before proceeding. Modules that cannot be
debugged as experimental versions - either because they require
complete control over the hardware or because of some other
characteristic that militates against more than one such module
operating on the network at any time - and changes in hardware can be
checked out during regularly scheduled experimental periods, during
which service to the user community is not guaranteed (Sundays,
holidays, and third shift hours are prime candidates for experimental
periods).

System Performance Profiles, Operational Error Reports, and
Correction Reports should be generated during all experimental periods,
and no change should be adopted until it has been shown to operate
successfully under experimental conditions. Furthermore, changes to
the documentation should occur concurrently with any system changes.

4. EXPERIMENTAL RESEARCH ON THE NETWORK

The purpose of a national computer-based network is to promote the
sharing of resources of all types: hardware facilities, software
facilities, data bases, and human experience. As such, it fosters a
climate that is conducive to research in general, and to research in
the area of software development, testing, and validation in particular.
The benefits of providing new software services on the network as
opposed to doing so at an isolated installation include:

(1) rapid exposure of the service to a large group of users who can
be expected to exercise it thoroughly. Because of the diversity
of their interests, experiences, and biases, their reactions
(e.g., the Operational Errors File, special interest group
communications) should provide more reliable data more quickly
than is obtainable otherwise.

(2) sharing the cost of developing new services among many users,
which permits a wide variety of services to be offered, and
which encourages innovation.

(3) discouraging the "not invented here" syndrome, by involving a
larger portion of the data processing community earlier in
the development of software projects.

(4) encouraging and facilitating meaningful communications within
the research community.

4.1 Specific Network Facilities to Support Software Research

Many of the tools which facilitate effective software production
already exist, but not in a common environment where they can be
effectively utilized by a broad spectrum of programmers. A distributed
network could provide the framework for a software production
laboratory, which would include a software library, compatible
interpreters and compilers offering sophisticated debugging and
optimization features, program execution monitors and test data
generators, verification condition generators for a variety of
languages, and theorem-proving programs. The software production
laboratory would serve, therefore, both to facilitate further software
research and as an environment in which users could produce better
programs more efficiently.

4.1.1 The Software Library

A software library is a collection of programs and data which are
of general interest and utility. The library, of course, need not
reside at a single installation, but may be distributed over the
network. What distinguishes the programs of the software library from
other programs shared over the network is that their reliability and
conformance with explicitly defined standards are, in some real sense,
guaranteed by the network. That is, network central will establish
stringent requirements for entering a program into the library (e.g.,
that it has run under experimental conditions for a given amount of
time with an acceptably low frequency of Operational Errors) and for
maintaining it once it belongs to the library (e.g., a daily resolution
of the complaints in the Operational Errors File). By taking programs
from the software library, network users can avoid duplicating each
other's efforts in writing and debugging commonly needed routines.

The following types of programs are prime candidates for inclusion 
in a network software library: 

(1) Mathematical Function Routines: These are routines which
compute the trigonometric functions and other commonly used
functions (e.g., Bessel functions, Ackermann's function,
etc.) with some prescribed degree of accuracy, and which
perform customary mathematical operations (e.g., linear
regression analysis and the like). Because these programs
have been thoroughly tested and are vigilantly maintained,
they are useful also as a standard against which the user
may compare his own work.

(2) Functional Test Routines for Compilers (of Widely Used
Languages): These are sets of programs which determine
whether or not a subject compiler provides specific
capabilities, and, in particular, whether a certain "standard
subset" of the language is compiled in an acceptable way.
Well structured and maintained functional test routines can
constitute the basis for de-facto language standards, and
as such are especially important and interesting. Functional
test routines currently exist for COBOL and FORTRAN
compilers, the former being a part of the Government's
definition of standard COBOL. It would be both appropriate and
convenient to provide these functional test sets over a
distributed network.

It is unsettling, however, that the vitally important
problem of establishing (minimum) standards for compiler
diagnostics has been hitherto ignored. Since the efficacy
of a compiler is directly proportional to the quality of
its diagnostics, i.e., to the amount of information the
compiler supplies concerning the nature and location of
unacceptable code, it would be worthwhile to develop, in
addition to the set of functional test routines, a set of
standards for compiler diagnostics. Within this context,
the diagnostic standards could take the form of a set of
programs with specific errors in them. To meet all
standards, then, a compiler would have to process both the
functional test routines and the deliberately incorrect
programs in an acceptable manner.
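
The two-part standard just described (functional test routines plus
deliberately incorrect programs) can be sketched as a small harness.
Purely for illustration, Python's built-in compile() stands in for the
subject compiler, and "acceptable" diagnostics are assumed to mean only
that the error is reported on the expected line; the function name
check_compiler and this acceptance criterion are assumptions, not part
of any existing standard.

```python
def check_compiler(functional_tests, faulty_programs):
    """functional_tests: source strings that must be accepted.
    faulty_programs: (source, expected_error_line) pairs that must be
    rejected with a diagnostic pointing at the expected line.

    Returns a list of failure descriptions; empty means all tests met.
    """
    failures = []
    for i, src in enumerate(functional_tests):
        try:
            compile(src, f"<functional {i}>", "exec")
        except SyntaxError as exc:
            failures.append(f"functional test {i} rejected: {exc.msg}")
    for i, (src, line) in enumerate(faulty_programs):
        try:
            compile(src, f"<faulty {i}>", "exec")
        except SyntaxError as exc:
            if exc.lineno != line:
                failures.append(
                    f"faulty program {i}: error located at line "
                    f"{exc.lineno}, expected {line}")
        else:
            failures.append(
                f"faulty program {i} accepted without a diagnostic")
    return failures
```

A fuller standard would also judge the wording of each diagnostic, which
is harder to check mechanically than its location.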

4.1.2 Fully Compatible Interpreters and Compilers

Much of the programming activity on a distributed network will be
done in an interactive conversational mode. It is important,
therefore, to provide tools which support interactive program
production, debugging, modification, and testing. In particular, it is
convenient to compose a program at a terminal using an interpreter
which can field breaks or errors within a computation, evaluate
arbitrary expressions during breaks or at the top level, provide a
trace of the values of specified variables from the breakpoint back
through the computation, and permit the programmer to modify or cancel
the effects of the current command, thus recovering an earlier state.
When the code has been debugged, however, it may be desirable to
compile it, perhaps using an optimizing compiler (e.g., if it is a
production program which will be executed frequently, or if, as in a
theorem-proving program, even a single execution is expected to be very
time consuming). By offering fully compatible interpreters and
compilers, then, the network can provide its users with a rich and
flexible environment for programming.

4.1.3 Automated Software Testing Tools

The tasks of debugging, modifying, testing, documenting, and, in
general, understanding the logical structure of a program are greatly
facilitated by the use of software testing tools. There are two main
categories of analysis: static analysis, which is performed without
executing the software, and dynamic analysis, which is dependent upon
information collected while the software is in execution.

4.1.3.1 Static Analyzers: These software tools accept a
subject program as input and produce the following type of
information as output:

(1) a display of the program structure and logic flow;

(2) a description of the global data, i.e., the data which
is shared among the subroutines;

(3) a subroutine/global variables referenced listing;

(4) a global variable/subroutines where referenced listing;

(5) a subroutine/subroutines referenced listing;

(6) a subroutine/subroutines where referenced listing;

(7) an entry point/subroutine listing;

(8) a subroutine/entry points listing;

(9) a description of the disconnected portions of code,
i.e., code which cannot be reached from the 'start'
state;

(10) a description of the blocked portions of code, i.e.,
code from which an 'exit' state cannot be reached.
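
Items (9) and (10) are reachability questions on the program's flow
graph. As a minimal sketch, assume the program has already been reduced
to a dictionary mapping each block of code to its possible successors
(the representation and the function names below are hypothetical);
disconnected code is then whatever a forward search from the start
state misses, and blocked code is whatever a backward search from the
exit states misses.

```python
def reachable(graph, start):
    """Return the set of nodes reachable from start (depth-first)."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def disconnected_and_blocked(graph, start, exits):
    """Return (disconnected nodes, blocked nodes) of a flow graph."""
    nodes = set(graph) | {s for succs in graph.values() for s in succs}
    from_start = reachable(graph, start)
    # Reverse every edge, then search backward from each exit state.
    reverse = {}
    for node, succs in graph.items():
        for s in succs:
            reverse.setdefault(s, []).append(node)
    can_exit = set()
    for e in exits:
        can_exit |= reachable(reverse, e)
    return nodes - from_start, nodes - can_exit
```

For example, in a graph where block "dead" has no predecessor and block
"loop" can only transfer to itself, the first is reported as
disconnected and the second as blocked.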

Other tools have been suggested which analyze the possible
execution paths of a program, and output a (hopefully minimal)
subset of paths which exercise every statement and/or branch
option in the program ([3], [6]). These potential path analyzers
can also identify execution paths which include a particular
instruction or sequence of instructions. This information is
extremely valuable to a programmer in constructing a set of test
cases that will thoroughly exercise his code. The major challenge
in developing potential path analyzers is finding some appropriate
way to deal with the enormous number of possible execution paths
of even relatively simple programs. Perhaps modularly designed
structured programs offer some promise in this regard: if each
module is analyzed independent of the others, and then the flow
from module to module is considered, the combinatorial problem
will be eased.
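
To make the idea of a small covering subset of paths concrete, the
sketch below enumerates the paths of a loop-free flow graph and then
picks paths greedily until every branch has been exercised. This is an
illustration under stated assumptions, not the method of [3] or [6]:
the graph is assumed acyclic (with loops the number of paths is
unbounded), the greedy choice is not guaranteed to be minimal, and full
enumeration is exactly the combinatorial cost the text warns about.

```python
def all_paths(graph, start, exit_node):
    """Enumerate every start-to-exit path of an acyclic flow graph."""
    if start == exit_node:
        return [[start]]
    return [[start] + rest
            for succ in graph.get(start, ())
            for rest in all_paths(graph, succ, exit_node)]


def covering_paths(graph, start, exit_node):
    """Greedily choose paths until every branch (edge) is exercised."""
    paths = all_paths(graph, start, exit_node)
    if not paths:
        return []
    uncovered = {(a, b) for a, succs in graph.items() for b in succs}
    chosen = []
    while uncovered:
        best = max(paths,
                   key=lambda p: len(uncovered & set(zip(p, p[1:]))))
        gained = uncovered & set(zip(best, best[1:]))
        if not gained:
            break  # remaining branches lie on no start-to-exit path
        chosen.append(best)
        uncovered -= gained
    return chosen
```

For two decision points in sequence there are four paths but only two
are needed to exercise all eight branches, which is the saving a
potential path analyzer aims for.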

4.1.3.2 Dynamic Analyzers: There are software tools which,
by inserting traps in the subject program, cause the following
types of information to be produced in addition to the program's
normal output:

(1) the number of times each statement in the program has been
executed in a single run or series of runs;

(2) the number of times each transfer in the program has been
executed in a single run or series of runs;

(3) the number of times each subroutine in the program has
been entered during a single run or series of runs;

(4) the amount of time spent in each subroutine during a
single run or series of runs;

(5) for each statement assigning a new value to a specified
variable, the maximum, minimum, first, and last value
assigned during the computation.

The operation of a dynamic analyzer is shown schematically
in Figure 3.

Figure 3: Dynamic Analyzer

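
Item (1) of the list above, statement execution counts, can be
illustrated with a few lines of Python. Note the substitution: rather
than inserting traps into the source text, this sketch uses the
interpreter's own tracing hook, sys.settrace, to obtain the same
counts. The function name count_lines is hypothetical, and lines are
numbered relative to the function's def line.

```python
import sys
from collections import Counter


def count_lines(func, *args):
    """Run func(*args); count executions of each of its source lines.

    Keys of the returned Counter are line offsets from the def line.
    """
    counts = Counter()
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace other functions
        if event == "line":
            counts[frame.f_lineno - code.co_firstlineno] += 1
        return tracer

    old = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(old)  # always restore the previous hook
    return result, counts
```

A line offset that never appears in the counts after a series of runs
identifies a statement that no test case has exercised.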
Because the programmer can now accurately analyze the effectiveness
of his test cases, i.e., he knows how many times each
statement (transfer, subroutine) has been exercised, and, in
particular, he knows which statements (transfers, subroutines)
have never been exercised, he can construct a set of test cases
that is both thorough and minimally redundant. Regression testing,
required during validation and maintenance phases, is simplified
as well. When a portion of code has been altered (corrected,
improved, etc.), the set of test runs involving the changed code,
i.e., the set of tests that must be re-evaluated, is readily
identified.

Although the traps inserted by the dynamic analyzer will
usually be removed before the program begins 'normal' operation
(the traps introduce considerable overhead in both space and
time), it may sometimes be desirable to leave them intact for
a while. For example, if a program is to be optimized, it is
extremely important to know which portions of code are repeatedly
executed during normal operation. Small improvements in these
will result in a significantly more efficient program. Conversely,
if a portion of code is executed only rarely, it might
not be worthwhile to bother optimizing it at all. In a similar
vein, a precise description of the normally running program in
terms of the types of instructions executed, number of calls
made to specific system routines, time spent performing certain
functions, average running time, etc., is essential if an
accurate model of the program is to be built.

It should be noted that static and dynamic analyzers accept
a program written in some (higher level) language A as input,
and output a detailed program description, or another (augmented)
program in language A, respectively. Theoretically, then, a
single set of these tools could be useful for language A
programs running anywhere on the network.

4.1.4 Verification Condition Generators and Theorem Provers

For some programs, such as programs which deploy nuclear weapons,
handle air traffic control, or control access to ultra-sensitive files,
testing is not sufficient. Testing a program thoroughly serves to
increase confidence in its reliability. However, no set of test cases
(short of an exhaustive list of all possible inputs) will ever guarantee
correctness in any mathematical sense. A rigorous proof consists of two
separate but related tasks:

(1) Given the subject program together with certain additional
information (assertions over the program variables) provided
by the programmer, generate a set of potential theorems,
the proof of which ensures the correctness of the program.
The potential theorems are called verification conditions.

(2) Prove each of the verification conditions.
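
As a small illustration (a hypothetical example, not drawn from the
references), suppose the programmer supplies the input assertion
x >= 0 and the output assertion y > 0 for the one-statement program
y := x + 1 over the integers. Assuming the usual axiom for assignment
as the semantic definition, task (1) substitutes x + 1 for y in the
output assertion and yields a single verification condition:

```latex
\{\, x \ge 0 \,\} \quad y := x + 1 \quad \{\, y > 0 \,\}
\qquad \text{gives} \qquad
x \ge 0 \;\Rightarrow\; x + 1 > 0
```

Task (2) is then to prove this implication, which here is trivial;
realistic programs give rise to many such conditions, not all of them
simple.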

The overall process of proving that a program is correct is
depicted in Figure 4.

Figure 4: Proof of Program Correctness



The verification condition generator accepts the program and the
assertions as input, and, using a semantic definition of the
programming language, generates verification conditions. Each
verification condition (many, but not all of which are trivially
simple to prove) is proven manually, i.e., by a human, or
automatically, i.e., by a theorem-proving program. Some important
research (which has implications in the areas of programming language
design, and overall system design) in the area of hand-generated
proofs has been done by R. L. London [9], [10], C.A.R. Hoare [5], and
others. Since these proofs can be lengthy and tedious, however, they
are subject to error in much the same way as the original program was.
For this reason, and because proving the correctness of large programs
involves proving many verification conditions, the concept of
machine-generated proofs is appealing.

The principal obstacle to proving programs correct automatically
lies in the fact that all current theorem-provers are inefficient.
Most of the inferences they generate turn out either to be irrelevant
to the proof which is eventually produced, or to provide less
information than an inference generated previously. For most
interesting problems, all available resources (i.e., time and space)
are exhausted before a proof can be constructed. Various strategies,
some involving the interactive intervention of human intelligence at
key points in the proof process, have been devised in an attempt to
increase the efficiency and effectiveness of theorem-provers. It would
be a major contribution to this research if a "program-proving
facility" were made available over the network. Since neither
verification condition generators nor theorem-provers are available at
most installations, this facility would include modules embodying
verification condition generators (for each of several programming
languages), basic inference generators, and a broad range of proof
strategies. Additional modules would be incorporated into the facility
by interested users as they are developed. Given such a facility, new
program-proving systems could be built by re-configuring the various
modules. Since experts in this field are widely separated
geographically, a network program-proving facility would serve also to
promote rapid up-to-date communication and software sharing among them.
Moreover, the network can offer a variety of hardware not available at
any single research installation (e.g., the associative processor which
may be available over the ARPA network).

4.1.5 On-Line Documentation and Interactive Help Routines

On-line documentation capabilities and extensive interactive help
routines are particularly appropriate in a networking environment,
where many continuously changing facilities are being shared by a broad
spectrum (in terms of experience and interests) of users. The
documentation aids serve to familiarize a user with the software and to
identify recent changes made to it; help routines assist him in
diagnosing and correcting problems he encounters in using the software.
To maximize the utility of the documentation, programmers should be
able to determine (e.g., by choosing among several options) the
quantity, level of detail, and, wherever appropriate, the format of the
documentation they request. Help routines should be flexible as well.
For example, the "standard" prelude should be omitted at the user's
request.

In systems relying on hard-copy communication of changes (e.g.,
newsletters, updates to systems manuals), documentation lag is an
inherent and unavoidable characteristic. An especially valuable feature
of on-line documentation, therefore, is that it can always be kept
current (provided system changes are made together with, and not prior
to, changes in the on-line documentation). Within this context, then,
the user should be able to request information concerning the current
status of the system, for example,

(1) a list of all changes made to a particular system during
the last week (month, day, etc.);

(2) a list of all corrections made in response to complaints
initiated by the user;

(3) a list of all corrections made to a particular module 
within a system, etc. 

Note that this information is readily obtainable from the Operational 
Errors File and the Corrections File. 
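
The three example requests translate directly into selections over the
two files. The sketch below assumes, purely for illustration, that each
report is held as a dictionary with the keys shown ("system", "module",
"date", "errors_addressed", "error_id", "complainants"); these key
names and the function names are hypothetical, not a defined file
format.

```python
from datetime import date, timedelta


def changes_since(corrections, system, today, days=7):
    """(1) Corrections made to a system during the last `days` days."""
    cutoff = today - timedelta(days=days)
    return [c for c in corrections
            if c["system"] == system and c["date"] >= cutoff]


def corrections_for_user(corrections, errors, user):
    """(2) Corrections answering complaints initiated by `user`."""
    mine = {e["error_id"] for e in errors if user in e["complainants"]}
    return [c for c in corrections if mine & set(c["errors_addressed"])]


def corrections_to_module(corrections, system, module):
    """(3) Corrections made to one module within a system."""
    return [c for c in corrections
            if c["system"] == system and c["module"] == module]
```

Because the answers are computed from the files at the time of the
request, they are as current as the files themselves.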

In the absence of any direct and objective measure, software
quality and reliability (or the lack thereof) can be gauged only in
terms of malfunctions that occur, e.g., mean time between failures,
mean time to repair. The Operational Errors File and the Corrections
File, therefore, constitute a formal mechanism by which to measure the
reliability of network software and assess the value of new software
features or approaches. That is, by using new techniques to build
some network software package, and then carefully analyzing the
Operational Errors and Corrections Files (to determine where and how
they differ from the Files associated with similar network software
packages that do not involve the new technique), it might be possible
to assess the impact and efficacy of the new technique. We caution that
factors which are either difficult to measure or are unknown or both
(e.g., capability of the programmer) will be reflected in the Files,
and that therefore, any conclusions drawn from the Files will have to
be based on very gross changes in such things as frequency of errors,
etc. With this reservation in mind, we now describe experiments
involving the concepts of structured programming and systematic testing
of programs (using software testing tools), both for their inherent
scientific value and also as illustrations of the type of research
that might be performed using the network and its Files.
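
Both measures named above can be computed directly from timestamps of
the kind such files would hold. The following Python sketch assumes
that each Operational Errors entry carries the time of the failure and
each matching Corrections entry the time of repair. Those fields, and
the simplification of treating the gap between consecutive failure
reports as the time between failures, are assumptions of the example.

```python
from datetime import datetime

def mean_time_between_failures(failure_times):
    """Mean gap in hours between consecutive failures (None if fewer than two)."""
    times = sorted(failure_times)
    if len(times) < 2:
        return None
    gaps = [(later - earlier).total_seconds() / 3600.0
            for earlier, later in zip(times, times[1:])]
    return sum(gaps) / len(gaps)

def mean_time_to_repair(report_repair_pairs):
    """Mean delay in hours between an error report and its correction."""
    delays = [(fixed - reported).total_seconds() / 3600.0
              for reported, fixed in report_repair_pairs]
    return sum(delays) / len(delays) if delays else None
```

For instance, failures reported at midnight on days 1, 2, and 4 of a
month give gaps of 24 and 48 hours, hence a mean of 36 hours.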

4.2.1 Structured Programming

Software which is built from a very limited and well-defined set
of control structures is thought to be more reliable than
conventionally written (i.e., unstructured) code [4]. The argument is
that programmers use their unrestricted GO-TO rights to construct an
intricate maze of arbitrary transfers, directing control helter-skelter
through the program and thereby obscuring the underlying logic. The
execution characteristics of such a program are extremely difficult to
analyze and the programmer is unlikely to know exactly what is going
on. On the other hand, if the program is built as a hierarchy of
modules, and if strict rules are enforced governing the transfer of
control within and between modules, the logic will be much more
explicit, and the program should be simpler to understand, document,
debug, test, and maintain. The computational completeness of certain
restricted classes of control structures has been proven by Bohm and
Jacopini [2], and Kosaraju [3].

By analyzing the Operational Errors File and the Corrections File
of a "structured" compiler offered on the network, and comparing them
to the Files of conventionally written compilers, we might be able to
address the following types of questions.

(a) Is structured programming worthwhile with respect to
reliability, e.g., are there fewer, less serious errors
in structured programs?

(b) Are the types of errors that occur in structured programs
different from those occurring in unstructured programs,
and can they be more easily detected and/or avoided?

(c) Are structured programs easier to maintain, and are they
less sensitive to modification, i.e., after program
modification, do fewer or less serious errors occur in
structured programs?
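
A comparison of this kind might begin with a summary such as the one
sketched below in Python. The severity labels, the normalization by
months in service, and the factor-of-two threshold standing in for a
"very gross" difference are all assumptions chosen for illustration;
the report does not fix any of them.

```python
def summarize(severities, months_in_service):
    """Summarize one package's error reports.

    `severities` holds one label per reported error, e.g. "minor" or
    "serious" (hypothetical labels).
    """
    total = len(severities)
    serious = sum(1 for s in severities if s == "serious")
    return {
        "errors_per_month": total / months_in_service,
        "serious_fraction": serious / total if total else 0.0,
    }

def grossly_different(rate_a, rate_b, factor=2.0):
    """True only if one rate exceeds the other by at least `factor`."""
    low, high = sorted((rate_a, rate_b))
    if low == 0:
        return high > 0
    return high / low >= factor
```

Under these assumptions, rates of 2 and 5 errors per month would count
as grossly different, while rates of 2 and 3 would not, and no
conclusion would be drawn from the latter.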

4.2.2 Systematic Testing of Software

Until it becomes practical to prove correctness for large programs
in a mathematically rigorous way, testing will be an important phase of
software development. However, since even simple software packages may
have an infinite input domain and an extraordinarily large number of
execution paths, it is impossible to test a program under all
conceivable running conditions. Current practice is to design and
implement a system, and then to test it for some arbitrary subset of
possible input values and environmental conditions. The program is
accepted when it executes these test cases correctly. However, there
are usually a significant number of residual errors. The user uncovers
these errors in the course of operation, when the software fails to run
for certain inputs, when the computed results are clearly incorrect, or
when the software reacts with its environment in unexpected and
undesirable ways. The cost to the user is substantial.
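
A rough count suggests why exhaustive path testing is impractical. The
Python sketch below assumes, purely for the example, a program made of
independent two-way decisions, either in sequence or inside a loop that
may execute up to a fixed number of times. Both function names are
hypothetical.

```python
def paths_through_sequence(decisions):
    """Paths through `decisions` independent two-way decisions in sequence."""
    return 2 ** decisions

def paths_through_loop(decisions_in_body, max_iterations):
    """Paths through a loop executed 0..max_iterations times.

    Each pass through the body can follow 2**decisions_in_body paths,
    so i passes contribute (2**decisions_in_body)**i paths.
    """
    per_pass = 2 ** decisions_in_body
    return sum(per_pass ** i for i in range(max_iterations + 1))
```

Twenty sequential decisions already give 1,048,576 paths, and a loop
whose body holds only two decisions yields 85 paths when it may run up
to three times.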

The high error content of developed and tested software is not
due to poor workmanship on the part of the developers and testers, but
rather to the lack of techniques for dealing adequately with the
complexity of large computer programs. In particular,

(1) the developer cannot accurately measure the effective- 
ness of a particular test; 

(2) the developer cannot determine whether his set of test
cases has thoroughly exercised the software. Moreover,
he cannot specify particular paths in the software which
have never been exercised;

(3) current software packages are so complex that a thorough
manual analysis of the test space is not feasible.

Dynamic analyzers (as described in Section 4.1.3.2) have been
proposed as a means of dealing more effectively with complex software
logic. The information provided by dynamic analyzers, e.g., which
statements are executed, which branches are taken, which subroutines
are entered and in what order, forms a basis for defining and
constructing a set of test cases which thoroughly tests a program.
Several definitions of a "thorough set of test cases" come to mind, for
example, a set which exercises every statement in the program at least
once, or a set which causes the execution of every branch in the
program. Having tested a network program "thoroughly," we can compare
it to similar programs tested ad-hoc and attempt to answer the
following types of questions:

(1) Are "thoroughly" tested programs more reliable, i.e., do
they have fewer, less serious errors?

(2) How is testing thoroughness related to reliability, i.e.,
are there degrees of thoroughness in testing, and are these
indicative of the reliability of the program?

(3) Is regression testing (performed after modifying the
software) easier (i.e., faster, cheaper) when testing tools are
used, and are the re-tested programs more reliable than
modified programs tested ad-hoc?

(4) Are certain programs, for example, structured programs,
more "testable" than others, i.e., does it take fewer test
cases to thoroughly test them, or are the test cases more
easily constructed?
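
What a dynamic analyzer records can be illustrated on a small scale.
The Python sketch below uses the interpreter's standard `sys.settrace`
hook to note which lines of a routine are executed by each test case.
The names `lines_executed`, `statement_coverage`, and `classify` are
hypothetical, introduced only for the example, and the approach is far
simpler than the analyzers discussed in this report.

```python
import sys

def lines_executed(func, *args):
    """Run func(*args); return offsets (from the 'def' line) of lines executed."""
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None            # do not trace other routines
        if event == "line":
            seen.add(frame.f_lineno - code.co_firstlineno)
        return tracer              # keep receiving events for this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)     # restore whatever tracer was active
    return seen

def statement_coverage(func, test_cases):
    """Union of lines executed over all test cases (each a tuple of arguments)."""
    covered = set()
    for args in test_cases:
        covered |= lines_executed(func, *args)
    return covered

def classify(n):                   # offset 0
    if n < 0:                      # offset 1
        return "negative"          # offset 2
    return "non-negative"          # offset 3
```

A test set containing only `classify(5)` never reaches the statement at
offset 2, so it is not "thorough" in the every-statement sense; adding
`classify(-1)` exercises that statement as well.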

4.2.3 Further Analysis of the Operational Errors
and Corrections Files

We have already suggested how the Operational Errors and
Corrections Files can be used to measure the reliability of network
software, the diligence with which network software is being
maintained, and the effectiveness of new software techniques. We now
suggest that the data collected in these files is useful in itself. One
major obstacle to software quality research has been the lack of hard
data concerning errors, e.g., what causes them, which types of errors
occur most (least) frequently, cause the most (least) serious
malfunctions, etc.
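
A first pass over such data could be as simple as the tally sketched
below in Python. The error types, severity labels, and entries in
`REPORTS` are invented for illustration and are not drawn from any
actual file.

```python
from collections import Counter

# Invented entries for illustration: (error type, severity).
REPORTS = [
    ("subscript out of range", "serious"),
    ("uninitialized variable", "minor"),
    ("subscript out of range", "minor"),
    ("wrong branch condition", "serious"),
    ("subscript out of range", "serious"),
]

def frequency_by_type(reports):
    """How often each type of error was reported."""
    return Counter(kind for kind, _ in reports)

def serious_by_type(reports):
    """How often each type of error caused a serious malfunction."""
    return Counter(kind for kind, severity in reports
                   if severity == "serious")
```

Calling `frequency_by_type(REPORTS).most_common(1)` then names the most
frequent error type in the sample.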

Given this data (which is precisely the data collected in the
Operational Errors and Corrections Files), it might be possible to
categorize software errors, and to determine how each class of errors
could have been avoided, for example:

(1) by using a modified version of the programming language, or
a different language altogether;

(2) by writing structured programs, or abandoning the concept;

(3) by using "standard" library versions of frequently needed
routines, rather than re-inventing (and re-debugging) them
each time;

(4) by employing mathematical proof concepts on a broader scale;

(5) by systematic testing using automated software testing
tools;

(6) by using more dynamic range and data bound checks, and
optionally compilable assertions (i.e., run-time tests),
or by distributing them differently.
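
The distinction drawn in item (6) can be illustrated in Python, where
`assert` statements are skipped when the interpreter is run with the
`-O` option and so behave as optionally compiled run-time tests. The
routine `scaled_reading` and its bounds are hypothetical, chosen only
to show the two kinds of check side by side.

```python
def scaled_reading(raw, low=0, high=1023):
    """Scale a raw reading in [low, high] to the interval [0.0, 1.0]."""
    # Data bound check: always performed, even in production runs.
    if not low <= raw <= high:
        raise ValueError(f"reading {raw} outside [{low}, {high}]")
    result = (raw - low) / (high - low)
    # Optionally compiled assertion: removed when Python runs with -O.
    assert 0.0 <= result <= 1.0, "scaling produced a value outside [0, 1]"
    return result
```

Where each kind of check is placed, and how many are kept in production
runs, is the "distribution" question the item raises.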

5. SUMMARY AND RECOMMENDATIONS

This report has described mechanisms for error collection and
analysis (the System Performance Profile, and the Operational
Errors and Corrections Files) with two goals in mind. The first is
to provide a basis for measuring network reliability (in terms of the
frequency of errors, or the frequency of user complaints, or the
amount of system downtime, etc.) and maintenance quality (in terms of
the delay between error reports and implemented corrections, or the
number of new errors introduced in the course of modifying the
system, etc.). Network standards for system reliability and
maintenance might, therefore, be established and enforced. The second
goal of the Files is to facilitate research in the area of software
quality, which has to date been crippled by a paucity of hard data
concerning the nature of software errors in large systems. No one
has yet been able to analyze and categorize software errors,
determine their frequency of occurrence, and then suggest ways to
identify and/or avoid them. The data collected in the Files of a large
distributed network should be valuable in this type of endeavor.

Moreover, by carefully monitoring and analyzing changes in the
Operational Errors and Corrections Files, the efficacy of new
software techniques might be assessed. Experiments were suggested to
evaluate the utility of structured programming and of systematic
software testing. Finally, the possibility of creating a "software
production laboratory" on the network was addressed, and specific
facilities for supporting such a concept were suggested. These
included compatible interpreters and compilers, a wide range of
verification condition generators and theorem-provers, automated
software testing tools, and extensive on-line documentation and
interactive help routines.